{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "39354318",
   "metadata": {},
   "source": [
    "# Visual Question Answering\n",
    "\n",
    "This example will show you how to use GPTCache and Replicate to implement question answering about images, it uses BLIP model to answer free-form questions about images in natural language. Where the Replicate will be used to return the answer, and GPTCache will cache the generated answer so that the next time the same or similar question about the image is requested, it can be returned directly from the cache, which can improve efficiency and reduce costs.\n",
    "\n",
    "This bootcamp is divided into three parts: how to initialize gptcache, running the Replicate model to get the answer, and finally showing how to start the service with gradio. You can also try this example on [Google Colab](https://colab.research.google.com/drive/1W6dQfkX9p8cMfdIuWBVPi-iru7OlGPSH?usp=share_link)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7abc8f5",
   "metadata": {},
   "source": [
    "## Initialize the gptcache\n",
    "\n",
    "Please [install gptcache](https://gptcache.readthedocs.io/en/latest/index.html#) first, then we can initialize the cache. There are two ways to initialize the cache, the first is to use the map cache (exact match cache) and the second is to use the database cache (similar search cache), it is more recommended to use the second one, but you have to install the related requirements.\n",
    "\n",
    "Before running the example, make sure the `REPLICATE_API_TOKEN` environment variable is set by executing `echo $REPLICATE_API_TOKEN`. If it is not already set, it can be set by using `export REPLICATE_API_TOKEN=YOUR_API_TOKEN` on Unix/Linux/MacOS systems or `set REPLICATE_API_TOKEN=YOUR_API_TOKEN` on Windows systems."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f78f8d1b",
   "metadata": {},
   "source": [
    "### 1. Init for exact match cache\n",
    "\n",
    "`cache.init` is used to initialize gptcache, the default is to use map to search for cached data, `pre_embedding_func` is used to pre-process the data inserted into the cache, and it will use the `get_input_str` method, more configuration refer to [initialize Cache](https://gptcache.readthedocs.io/en/latest/references/gptcache.html#module-gptcache.Cache)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9cd436f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# from gptcache import cache\n",
    "# from gptcache.processor.pre import get_input_str\n",
    "# # init gptcache\n",
    "# cache.init(pre_embedding_func=get_input_str)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f88b92d6",
   "metadata": {},
   "source": [
    "### 2. Init for similar match cache\n",
    "\n",
    "When initializing gptcahe, the following four parameters are configured:\n",
    "\n",
    "- `pre_embedding_func`: \n",
    "\n",
    "   Pre-processing before extracting feature vectors, it will use the `get_input_image_file_name` method.\n",
    "\n",
    "- `embedding_func`:\n",
    "\n",
    "   The method to extract the image feature vector, you can refer to [gptcache.embedding](https://gptcache.readthedocs.io/en/latest/references/embedding.html) for options of image embedding methods.\n",
    "\n",
    "- `data_manager`:\n",
    "\n",
    "   DataManager for cache management. It is used for image feature vector, question and response answer in the example, it takes [Milvus](https://milvus.io/docs) (please make sure it is started), you can also configure other vector storage, refer to [VectorBase API](https://gptcache.readthedocs.io/en/latest/references/manager.html#module-gptcache.manager.vector_data).\n",
    "\n",
    "- `similarity_evaluation`:\n",
    "\n",
    "   The evaluation method after the cache hit. It evaluates the similarity between the current question and questions of cache hits. In this case, you can select `ExactMatchEvaluation`, `OnnxModelEvaluation`, `NumpyNormEvaluation` from [gptcache.similarity_evaluation](https://gptcache.readthedocs.io/en/latest/references/similarity_evaluation.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "5bb28fb8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from gptcache import cache\n",
    "from gptcache.adapter import openai\n",
    "from gptcache.processor.pre import get_input_image_file_name\n",
    "\n",
    "from gptcache.embedding import Timm\n",
    "from gptcache.similarity_evaluation.onnx import OnnxModelEvaluation\n",
    "from gptcache.manager import get_data_manager, CacheBase, VectorBase\n",
    "\n",
    "\n",
    "timm = Timm()\n",
    "cache_base = CacheBase('sqlite')\n",
    "vector_base = VectorBase('milvus', host='localhost', port='19530', dimension=timm.dimension)\n",
    "data_manager = get_data_manager(cache_base, vector_base)\n",
    "\n",
    "cache.init(\n",
    "    pre_embedding_func=get_input_image_file_name,\n",
    "    embedding_func=timm.to_embeddings,\n",
    "    data_manager=data_manager,\n",
    "    similarity_evaluation=OnnxModelEvaluation(),\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "266888ef",
   "metadata": {},
   "source": [
    "## Run replicate blip\n",
    "\n",
    "Then run `replicate.run`, which will use blip model to answer free-form questions about images in natural language.\n",
    "\n",
    "Note that `replicate` here is imported from `gptcache.adapter.replicate`, which can be used to cache with gptcache at request time. Please download the [merlion.png](https://github.com/salesforce/LAVIS/raw/main/docs/_static/merlion.png) before running the following code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "a4e83890",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "singapore\n"
     ]
    }
   ],
   "source": [
    "from gptcache.adapter import replicate\n",
    "\n",
    "output = replicate.run(\n",
    "            \"andreasjansson/blip-2:4b32258c42e9efd4288bb9910bc532a69727f9acd26aa08e175713a0a857a608\",\n",
    "            input={\"image\": open(\"./merlion.png\", \"rb\"),\n",
    "                   \"question\": \"Which city is this photo taken?\"}\n",
    "        )\n",
    "print(output)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15f7734a",
   "metadata": {},
   "source": [
    "## Start with gradio\n",
    "\n",
    "Finally, we can start a gradio application to answer the questions about images. First define the `vqa` method, then start the service with gradio, as shown below:\n",
    "\n",
    "![](../assets/vqa.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "319fcb7d",
   "metadata": {},
   "outputs": [],
   "source": [
    "def vqa(img, question):\n",
    "    output = replicate.run(\n",
    "            \"andreasjansson/blip-2:4b32258c42e9efd4288bb9910bc532a69727f9acd26aa08e175713a0a857a608\",\n",
    "            input={\"image\": open(img, \"rb\"),\n",
    "                   \"question\": question}\n",
    "        )\n",
    "    return output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "65b9134a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import gradio\n",
    "\n",
    "interface = gradio.Interface(vqa, \n",
    "                             [gradio.Image(source=\"upload\", type=\"filepath\"), gradio.Textbox(label=\"Question\")],\n",
    "                             gradio.Textbox(label=\"Answer\")\n",
    "                            )\n",
    "\n",
    "interface.launch(inline=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
